
The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 



Application Serial Number: 09/913,159



Source: 



Date Processed by STIC: ^/l^/^oOl 

THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 
PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216.

PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax)
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW:



Checker Version 3.0 

The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user interface to check whether a sequence listing is in
compliance with format and content rules. Checker Version 3.0 works for sequence listings
generated for the original version of 37 CFR §§ 1.821 - 1.825 effective October 1, 1990 (old
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual
Property Organization (WIPO) Standard ST.25.

Checker Version 3.0 replaces the previous DOS-based version of Checker, and is Y2K-compliant.
Checker allows public users to check sequence listings in Computer Readable Form
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO).
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence
listings, thus saving time and money.

Checker Version 3.0 can be downloaded from the USPTO website at the following address:

http://www.uspto.gov/web/offices/pac/checker 



Raw Sequence Listing Error Summary

SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

ERROR DETECTED, followed by SUGGESTED CORRECTION

1 Wrapped Nucleics / Wrapped Aminos
The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
prevent "wrapping."

2 Invalid Line Length
The rules require that a line not exceed 72 characters in length. This includes white spaces.

3 Misaligned Amino Numbering
The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
use space characters, instead.

4 Non-ASCII
The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
ensure your subsequent submission is saved in ASCII text.

5 Variable Length
Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
each n or Xaa can only represent a single residue. Please present the maximum number of each
residue having variable length and indicate in the <220>-<223> section that some may be missing.

6 PatentIn 2.0 "bug"
A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
sequence(s) ____. Normally, PatentIn would automatically generate this section from the
previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
Artificial or Unknown sequences.

7 Skipped Sequences (OLD RULES)
Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:

(2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
(i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
This sequence is intentionally skipped

Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

8 Skipped Sequences (NEW RULES)
Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:

<210> sequence id number
<400> sequence id number
000

9 Use of n's or Xaa's (NEW RULES)
Use of n's and/or Xaa's has been detected in the Sequence Listing.
Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
In the <220> to <223> section, please explain the location of n or Xaa, and which residue n or Xaa represents.

10 Invalid <213> Response
Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
scientific name (Genus/species). A <220>-<223> section is required when the <213> response is Unknown or
is Artificial Sequence.

11 Use of <220>
Sequence(s) ____ missing the <220> "Feature" and associated numeric identifiers and responses.
Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
"Unknown." Please explain source of genetic material in <220> to <223> section.
(See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

12 PatentIn 2.0 "bug"
Please do not use the "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

13 Misuse of n
n can only be used to represent a single nucleotide in a nucleic acid sequence. N is not used to represent
any value not specifically a nucleotide.
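Several of the line-level rules in the summary (the 72-character limit, ASCII-only text, and the advice to avoid tab codes) can be checked mechanically before a file is submitted. The following is a minimal sketch, not the Checker program itself; the function name `check_listing_lines` and the sample text are hypothetical and exist only for illustration.

```python
def check_listing_lines(text, max_len=72):
    """Return (line_number, problem) tuples for simple line-level rule violations."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # The summary states that a line may not exceed 72 characters, spaces included.
        if len(line) > max_len:
            problems.append((number, "line longer than %d characters" % max_len))
        # Tab codes are discouraged; space characters should be used instead.
        if "\t" in line:
            problems.append((number, "tab character"))
        # The file must be saved as ASCII text.
        if any(ord(ch) > 127 for ch in line):
            problems.append((number, "non-ASCII character"))
    return problems


# Hypothetical sample: one clean line, one over-long line, one tab, one non-ASCII byte.
sample = "<212> TYPE: DNA\n" + "a" * 73 + "\n1\t5\ncaf\u00e9"
report = check_listing_lines(sample)
for number, problem in report:
    print(number, problem)
```

A check like this only covers the formatting rules; the content rules (for example the mandatory <220>-<223> sections) need the numeric identifiers to be parsed as well.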
PCT09



RAW SEQUENCE LISTING 

PATENT APPLICATION: US/09/913,159 



DATE: 08/23/2001 
TIME: 14:18:57 



Input Set : A:\P057760.app

Output Set: N:\CRF3\08162001\I913159.raw 



5 <110> APPLICANT: Strathmann AG & Co.

7 <120> TITLE OF INVENTION: Virus Vaccine

<130> FILE REFERENCE: P057760

<140> CURRENT APPLICATION NUMBER: US/09/913,159

<141> CURRENT FILING DATE: 2001-08-10

14 <150> PRIOR APPLICATION NUMBER: 199 07 485.2

15 <151> PRIOR FILING DATE: 1999-02-12

17 <160> NUMBER OF SEQ ID NOS: 12

19 <170> SOFTWARE: PatentIn Ver. 2.1



Does Not Comply 
Corrected Diskette Needed 



ERRORED SEQUENCES 



21 <210> SEQ ID NO: 1

22 <211> LENGTH: 9709

23 <212> TYPE: DNA

24 <213> ORGANISM: Human immunodeficiency virus

26 <400> SEQUENCE: 1

W--> 27 [illegible] aatttggtcc caaaaaagac aagagatcct tgatctgtgg atctacca-
E--> 28 [illegible] 60

W--> 29 [illegible] [illegible] tggcagaact acacaccagg gccagggatc [illegible]-
E--> 30 tccac 120

W--> 31 tgacctttgg atggtgcttc aagttagtac cagttgaacc agagcaagta ga[illegible]-
E--> 32 gaggcca 180

W--> 33 aataaggaga gaagaacagc ttgttacacc ctatgagcca gcatgggatg gag-
E--> 34 gacccgg 240

E--> 35 agggagaagt attagtgtgg aagtttgaca gcctcctagc atttcgtcac
E--> 36 atggcccgag 300

E--> 37 agctgcatcc ggagtactac aaagactgct gacatcgagc tttctacaag
E--> 38 ggactttccg 360

W--> 39 ctggggactt tccagggagg tgtggcctgg gcgggactgg ggagtggcga gccctca-
E--> 40 gat 420

W--> 41 gctacatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatct-
E--> 42 ga 480

E--> 43 gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata
E--> 44 aagcttgcct 540

W--> 45 tgagtgctca aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta [illegible]-
E--> 46 gatccctc 600

W--> 47 agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gactt-
E--> 48 gaaag 660

E--> 49 cgaaagtaaa gccagaggag atctctcgac gcaggactcg gcttgctgaa
E--> 50 gcgcgcacgg 720

W--> 51 caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc ggaggcta-
E--> 52 ga 780

W--> 53 aggagagaga tgggtgcgag agcgtcggta ttaagcgggg gagaattaga taa-
E--> 54 atgggaa 840

W--> 55 aaaattcggt taaggccagg gggaaagaaa caatataaac taaaacatat [illegible]-



file://C:\CRF3\Outhold\VsrI913159.htm



8/23/01 




E--> 56 tatgggca 900 

E--> 57 agcagggagc tagaacgatt cgcagttaat cctggccttt tagagacatc 
E--> 58 agaaggctgt 960 

W--> 59 agacaaatac tgggacagct acaaccatcc cttcagacag gatcagaaga acttagat- 
E--> 60 ca 1020 

W--> 61 ttatataata caatagcagt cctctattgt gtgcatcaaa ggatagatgt aaaaga- 
E--> 62 cacc 1080 

W--> 63 aaggaagcct tagataagat agaggaagag caaaacaaaa gtaagaaaaa ggcacag- 
E--> 64 caa 1140 

W--> 65 gcagcagctg acacaggaaa caacagccag gtcagccaaa attaccctat agtgca- 
E--> 66 gaac 1200 

E--> 67 ctccaggggc aaatggtaca tcaggccata tcacctagaa ctttaaatgc
E--> 68 atgggtaaaa 1260

W--> 69 gtagtagaag agaaggcttt cagcccagaa gtaataccca tgttttcagc attatca- 
E--> 70 gaa 1320 

W--> 71 ggagccaccc cacaagattt aaataccatg ctaaacacag tggggggaca tcaag- 
E--> 72 cagcc 1380 

E--> 73 atgcaaatgt taaaagagac catcaatgag gaagctgcag aatgggatag 
E--> 74 attgcatcca 1440 

W--> 75 gtgcatgcag ggcctattgc accaggccag atgagagaac caaggggaag tgacatag- 
E--> 76 ca 1500 

E--> 77 ggaactacta gtacccttca ggaacaaata ggatggatga cacataatcc
E--> 78 acctatccca 1560 

W--> 79 gtaggagaaa tctataaaag atggataatc ctgggattaa ataaaatagt aa-
E--> 80 gaatgtat 1620

W--> 81 agccctacca gcattctgga cataagacaa ggaccaaagg aaccctttag agac- 
E--> 82 tatgta 1680 

W--> 83 gaccgattct ataaaactct aagagccgag caagcttcac aagaggtaaa aa-
E--> 84 attggatg 1740

W--> 85 acagaaacct tgttggtcca aaatgcgaac ccagattgta agactatttt aaaag- 
E--> 86 cattg 1800 

E--> 87 ggaccaggag cgacactaga agaaatgatg acagcatgtc agggagtggg 
E--> 88 gggacccggc 1860 

W--> 89 cataaagcaa gagttttggc tgaagcaatg agccaagtaa caaatccagc tacca- 
E--> 90 taatg 1920 

E--> 91 atacagaaag gcaattttag gaaccaaaga aagactgtta agtgtttcaa 
E--> 92 ttgtggcaaa 1980 

W--> 93 gaagggcaca tagccaaaaa ttgcagggcc cctaggaaaa agggctgttg gaa- 
E--> 94 atgtgga 2040 

W--> 95 aaggaaggac accaaatgaa agattgtact gagagacagg ctaatttttt agggaa-
E--> 96 gatc 2100

W--> 97 tggccttccc acaagggaag gccagggaat tttcttcaga gcagaccaga gccaa- 
E--> 98 cagcc 2160 

W--> 99 ccaccagaag agagcttcag gtttggggaa gagacaacaa ctccctctca gaagcag- 
E--> 100 gag 2220 

E--> 101 ccgatagaca aggaactgta tcctttagct tccctcagat cactctttgg 
E--> 102 cagcgacccc 2280 

W--> 103 tcgtcacaat aaagataggg gggcaattaa aggaagctct attagataca ggagca- 
E--> 104 gatg 2340 
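The W/E pairs on this page show what the "wrapped" error looks like in practice: a residue line ends in a hyphenated fragment, and the rest of that group lands on the next line together with the cumulative residue count. As a rough illustration, the sketch below rejoins one such pair, using lines 59 and 60 as printed above. The helper name `rejoin_pair` is hypothetical, and the flag and line-number prefixes are assumed to have been stripped already.

```python
def rejoin_pair(first, second):
    """Join a residue line with the fragment and running count printed below it."""
    groups = first.split()
    tail = second.split()
    count = int(tail[-1])   # last token is the cumulative residue count
    rest = tail[:-1]        # anything before it is sequence text
    if groups and groups[-1].endswith("-") and rest:
        # A word processor hyphenated the last group; glue the halves back together.
        groups[-1] = groups[-1][:-1] + rest.pop(0)
    groups.extend(rest)
    return " ".join(groups), count


# Lines 59 and 60 of the raw listing, without their "W-->"/"E-->" prefixes.
line_59 = "agacaaatac tgggacagct acaaccatcc cttcagacag gatcagaaga acttagat-"
line_60 = "ca 1020"
joined, count = rejoin_pair(line_59, line_60)
residues = len(joined.replace(" ", ""))
print(joined)
print(residues, count)
```

Rejoining is only a diagnostic aid here; the correction the notice asks for is a resubmitted file saved so that each line keeps its six groups and count together.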






E--> 105 atacagtatt agaagaaatg aatttgccag gaagatggaa accaaaaatg
E--> 106 atagggggaa 2400

W--> 107 ttggaggttt tatcaaagta ggacagtatg atcagatact catagaaatc tgcggaca-
E--> 108 ta 2460

W--> 109 aagctatagg tacagtatta gtaggaccta cacctgtcaa cataattgga agaa-
E--> 110 atctgt 2520

W--> 111 tgactcagat tggctgcact ttaaattttc ccattagtcc tattgagact gtaccag-
E--> 112 taa 2580

W--> 113 aattaaagcc aggaatggat ggcccaaaag ttaaacaatg gccattgaca gaagaa-
E--> 114 aaaa 2640

W--> 115 taaaagcatt agtagaaatt tgtacagaaa tggaaaagga aggaaaaatt tcaaaa-
E--> 116 attg 2700

W--> 117 ggcctgaaaa tccatacaat actccagtat ttgccataaa gaaaaaagac agtactaa-
E--> 118 at 2760

E--> 119 ggagaaaatt agtagatttc agagaactta ataagagaac tcaagatttc
E--> 120 tgggaagttc 2820

E--> 121 aattaggaat accacatcct gcagggttaa aacagaaaaa atcagtaaca
E--> 122 gtactggatg 2880

W--> 123 tgggcgatgc atatttttca gttcccttag ataaagactt caggaagtat actgcatt-
E--> 124 ta 2940

E--> 125 ccatacctag tataaacaat gagacaccag ggattagata tcagtacaat
E--> 126 gtgcttccac 3000

W--> 127 agggatggaa aggatcacca gcaatattcc agtgtagcat gacaaaaatc tta-
E--> 128 gagcctt 3060

W--> 129 ttagaaaaca aaatccagac atagtcatct atcaatacat ggatgatttg tatgtag-
E--> 130 gat 3120

W--> 131 ctgacttaga aatagggcag catagaacaa aaatagagga actgagacaa catctgtt-
E--> 132 ga 3180

E--> 133 ggtggggatt taccacacca gacaaaaaac atcagaaaga acctccattc
E--> 134 ctttggatgg 3240

W--> 135 gttatgaact ccatcctgat aaatggacag tacagcctat agtgctgcca gaaaagga-
E--> 136 ca 3300

W--> 137 gctggactgt caatgacata cagaaattag tgggaaaatt gaattgggca agtca-
E--> 138 gattt 3360

W--> 139 atgcagggat taaagtaagg caattatgta aacttcttag gggaaccaaa gcactaa-
E--> 140 cag 3420

W--> 141 aagtagtacc actaacagaa gaagcagagc tagaactggc agaaaacagg ga-
E--> 142 gattctaa 3480

W--> 143 aagaaccggt acatggagtg tattatgacc catcaaaaga cttaatagca gaaataca-
E--> 144 ga 3540

W--> 145 agcaggggca aggccaatgg acatatcaaa tttatcaaga gccatttaaa aatct-
E--> 146 gaaaa 3600

W--> 147 caggaaaata tgcaagaatg aagggtgccc acactaatga tgtgaaacaa ttaaca-
E--> 148 gagg 3660

W--> 149 cagtacaaaa aatagccaca gaaagcatag taatatgggg aaagactcct aaatt-
E--> 150 taaat 3720

E--> 151 tacccataca aaaggaaaca tgggaagcat ggtggacaga gtattggcaa
E--> 152 gccacctgga 3780

W--> 153 ttcctgagtg ggagtttgtc aatacccctc ccttagtgaa gttatggtac cagttaga-

E--> 154 ga 3840 

W--> 155 aagaacccat aataggagca gaaactttct atgtagatgg ggcagccaat agggaa- 
E--> 156 acta 3900 

E--> 157 aattaggaaa agcaggatat gtaactgaca gaggaagaca aaaagttgtc 
E--> 158 cccctaacgg 3960 

E--> 159 acacaacaaa tcagaagact gagttacaag caattcatct agctttgcag 
E--> 160 gattcgggat 4020

W--> 161 tagaagtaaa catagtgaca gactcacaat atgcattggg aatcattcaa gcacaac- 
E--> 162 cag 4080 

W--> 163 ataagagtga atcagagtta gtcagtcaaa taatagagca gttaataaaa aaggaa- 
E--> 164 aaag 4140 

W--> 165 tctacctggc atgggtacca gcacacaaag gaattggagg aaatgaacaa gta- 
E--> 166 gatgggt 4200 

W--> 167 tggtcagtgc tggaatcagg aaagtactat ttttagatgg aatagataag gcccaa- 
E--> 168 gaag 4260 

W--> 169 aacatgagaa atatcacagt aattggagag caatggctag tgattttaac ctac- 
E--> 170 cacctg 4320 

E--> 171 tagtagcaaa agaaatagta gccagctgtg ataaatgtca gctaaaaggg 
E--> 172 gaagccatgc 4380 

W--> 173 atggacaagt agactgtagc ccaggaatat ggcagctaga ttgtacacat ttagaag- 
E--> 174 gaa 4440 

E--> 175 aagttatctt ggtagcagtt catgtagcca gtggatatat agaagcagaa 
E--> 176 gtaattccag 4500 

W--> 177 cagagacagg gcaagaaaca gcatacttcc tcttaaaatt agcaggaaga tggccag- 
E--> 178 taa 4560 

E--> 179 aaacagtaca tacagacaat ggcagcaatt tcaccagtac tacagttaag 
E--> 180 gccgcctgtt 4620 

W--> 181 ggtgggcggg gatcaagcag gaatttggca ttccctacaa tccccaaagt caaggag- 
E--> 182 taa 4680 

W--> 183 tagaatctat gaataaagaa ttaaagaaaa ttataggaca ggtaagagat caggct- 
E--> 184 gaac 4740 

W--> 185 atcttaagac agcagtacaa atggcagtat tcatccacaa ttttaaaaga aa- 
E--> 186 agggggga 4800 

W--> 187 ttggggggta cagtgcaggg gaaagaatag tagacataat agcaacagac atacaa- 
E--> 188 acta 4860 

W--> 189 aagaattaca aaaacaaatt acaaaaattc aaaattttcg ggtttattac agggacag- 
E--> 190 ca 4920 

W--> 191 gagatccagt ttggaaagga ccagcaaagc tcctctggaa aggtgaaggg gcagtag- 
E--> 192 taa 4980 

W--> 193 tacaagataa tagtgacata aaagtagtgc caagaagaaa agcaaagatc at-
E--> 194 cagggatt 5040

W--> 195 atggaaaaca gatggcaggt gatgattgtg tggcaagtag acaggatgag gattaa- 
E--> 196 caca 5100 

E--> 197 tggaaaagat tagtaaaaca ccatatgtat atttcaagga aagctaagga 
E--> 198 ctggttttat 5160 

W--> 199 agacatcact atgaaagtac taatccaaaa ataagttcag aagtacacat cccac- 
E--> 200 taggg 5220 

E--> 201 gatgctaaat tagtaataac aacatattgg ggtctgcata caggagaaag 
E--> 202 agactggcat 5280 

W--> 203 ttgggtcagg gagtctccat agaatggagg aaaaagagat atagcacaca agta- 
E--> 204 gaccct 5340 

W--> 205 gacctagcag accaactaat tcatctgcac tattttgatt gtttttcaga atctgcta- 
E--> 206 ta 5400 

W--> 207 agaaatacca tattaggacg tatagttagt cctaggtgtg aatatcaagc aggaca- 
E--> 208 taac 5460 

W--> 209 aaggtaggat ctctacagta cttggcacta gcagcattaa taaaaccaaa acagataa- 
E--> 210 ag 5520 

W--> 211 ccacctttgc ctagtgttag gaaactgaca gaggacagat ggaacaagcc ccagaa- 
E--> 212 gacc 5580 

W--> 213 aagggccaca gagggagcca tacaatgaat ggacactaga gcttttagag gaacttaa- 
E--> 214 ga 5640 

W--> 215 gtgaagctgt tagacatttt cctaggatat ggctccataa cttaggacaa ca- 
E--> 216 tatctatg 5700 

W--> 217 aaacttacgg ggatacttgg gcaggagtgg aagccataat aagaattctg caa- 
E--> 218 caactgc 5760 

W--> 219 tgtttatcca tttcagaatt gggtgtcgac atagcagaat aggcgttact cgacagag- 
E--> 220 ga 5820

W--> 221 gagcaagaaa tggagccagt agatcctaga ctagagccct ggaagcatcc aggaagt-
E--> 222 cag 5880

E--> 223 cctaaaactg cttgtaccaa ttgctattgt aaaaagtgtt gctttcattg 
E--> 224 ccaagtttgt 5940

W--> 225 ttcatgacaa aagccttagg catctcctat ggcaggaaga agcggagaca gcgacgaa- 
E--> 226 ga 6000 

W--> 227 gctcatcaga acagtcagac tcatcaagct tctctatcaa agcagtaagt agta- 
E--> 228 catgta 6060 

W--> 229 atgcaaccta taatagtagc aatagtagca ttagtagtag caataataat agcaa- 
E--> 230 tagtt 6120 

W--> 231 gtgtggtcca tagtaatcat agaatatagg aaaatattaa gacaaagaaa aataga- 
E--> 232 cagg 6180 

W--> 233 ttaattgata gactaataga aagagcagaa gacagtggca atgagagtga aggagaag- 
E--> 234 ta 6240 

W--> 235 tcagcacttg tggagatggg ggtggaaatg gggcaccatg ctccttggga tattgat- 
E--> 236 gat 6300 

W--> 237 ctgtagtgct acagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaag- 
E--> 238 ga 6360 

W--> 239 agcaaccacc actctatttt gtgcatcaga tgctaaagca tatgatacag aggtaca- 
E--> 240 taa 6420 

W--> 241 tgtttgggcc acacatgcct gtgtacccac agaccccaac ccacaagaag tag- 
E--> 242 tattggt 6480 

W--> 243 aaatgtgaca gaaaatttta acatgtggaa aaatgacatg gtagaacaga tgcatgag- 
E--> 244 ga 6540 

E--> 245 tataatcagt ttatgggatc aaagcctaaa gccatgtgta aaattaaccc 
E--> 246 cactctgtgt 6600 

W--> 247 tagtttaaag tgcactgatt tgaagaatga tactaatacc aatagtagta gcggga- 
E--> 248 gaat 6660 

W--> 249 gataatggag aaaggagaga taaaaaactg ctctttcaat atcagcacaa gcataa- 
E--> 250 gaga 6720 

W--> 251 taaggtgcag aaagaatatg cattctttta taaacttgat atagtaccaa tagataa- 
E--> 252 tac 6780

W--> 253 cagctatagg ttgataagtt gtaacacctc agtcattaca caggcctgtc caa-
E--> 254 aggtatc 6840

W--> 255 ctttgagcca attcccatac attattgtgc cccggctggt tttgcgattc taa-
E--> 256 aatgtaa 6900

W--> 257 taataagacg ttcaatggaa caggaccatg tacaaatgtc agcacagtac aatgtaca-
E--> 258 ca 6960

W--> 259 tggaatcagg ccagtagtat caactcaact gctgttaaat ggcagtctag cagaa-
E--> 260 gaaga 7020

W--> 261 tgtagtaatt agatctgcca atttcacaga caatgctaaa accataatag tacagct-
E--> 262 gaa 7080

E--> 263 cacatctgta gaaattaatt gtacaagacc caacaacaat acaagaaaaa
E--> 264 gtatccgtat 7140

W--> 265 ccagagggga ccagggagag catttgttac aataggaaaa ataggaaata tgaga-
E--> 266 caagc 7200

W--> 267 acattgtaac attagtagag caaaatggaa tgccacttta aaacagatag ctagcaa-
E--> 268 att 7260

E--> 269 aagagaacaa tttggaaata ataaaacaat aatctttaag caatcctcag
E--> 270 gaggggaccc 7320

W--> 271 agaaattgta acgcacagtt ttaattgtgg aggggaattt ttctactgta attcaa-
E--> 272 caca 7380

W--> 273 actgtttaat agtacttggt ttaatagtac ttggagtact gaagggtcaa ataacact-
E--> 274 ga 7440

W--> 275 aggaagtgac acaatcacac tcccatgcag aataaaacaa tttataaaca tgtggcag-
E--> 276 ga 7500

W--> 277 agtaggaaaa gcaatgtatg cccctcccat cagtggacaa attagatgtt catcaaa-
E--> 278 tat 7560

W--> 279 tactgggctg ctattaacaa gagatggtgg taataacaac aatgggtccg agatctt-
E--> 280 cag 7620

W--> 281 acctggagga ggcgatatga gggacaattg gagaagtgaa ttatataaat ataaag-
E--> 282 tagt 7680

W--> 283 aaaaattgaa ccattaggag tagcacccac caaggcaaag agaagagtgg tgcagaga-
E--> 284 ga 7740

W--> 285 aaaaagagca gtgggaatag gagctttgtt ccttgggttc ttgggagcag caggaag-
E--> 286 cac 7800

W--> 287 tatgggctgc acgtcaatga cgctgacggt acaggccaga caattattgt ctgata-
E--> 288 tagt 7860

W--> 289 gcagcagcag aacaatttgc tgagggctat tgaggcgcaa cagcatctgt tgcaact-
E--> 290 cac 7920

W--> 291 agtctggggc atcaaacagc tccaggcaag aatcctggct gtggaaagat acctaa-
E--> 292 agga 7980

E--> 293 tcaacagctc ctggggattt ggggttgctc tggaaaactc atttgcacca
E--> 294 ctgctgtgcc 8040

E--> 295 ttggaatgct agttggagta ataaatctct ggaacagatt tggaataaca
E--> 296 tgacctggat 8100

W--> 297 ggagtgggac agagaaatta acaattacac aagcttaata cactccttaa ttgaa-
E--> 298 gaatc 8160

E--> 299 gcaaaaccag caagaaaaga atgaacaaga attattggaa ttagataaat
E--> 300 gggcaagttt 8220

W--> 301 gtggaattgg tttaacataa caaattggct gtggtatata aaattattca taatga-
E--> 302 tagt 8280

W--> 303 aggaggcttg gtaggtttaa gaatagtttt tgctgtactt tctatagtga atagagt-
E--> 304 tag 8340

W--> 305 gcagggatat tcaccattat cgtttcagac ccacctccca atcccgaggg gacccga-
E--> 306 cag 8400

W--> 307 gcccgaagga atagaagaag aaggtggaga gagagacaga gacagatcca ttcgat-
E--> 308 tagt 8460

W--> 309 gaacggatcc ttagcactta tctgggacga tctgcggagc ctgtgcctct tcagctac-
E--> 310 ca 8520

E--> 311 ccgcttgaga gacttactct tgattgtaac gaggattgtg gaacttctgg
E--> 312 gacgcagggg 8580

W--> 313 gtgggaagcc ctcaaatatt ggtggaatct cctacagtat tggagtcagg aactaaa-
E--> 314 gaa 8640

W--> 315 tagtgctgtt aacttgctca atgccacagc catagcagta gctgagggga caga-
E--> 316 tagggt 8700

W--> 317 tatagaagta ttacaagcag cttatagagc tattcgccac atacctagaa gaataa-
E--> 318 gaca 8760

W--> 319 gggcttggaa aggattttgc tataagatgg gtggcaagtg gtcaaaaagt agtgt-
E--> 320 gattg 8820

E--> 321 gatggcctgc tgtaagggaa agaatgagac gagctgagcc agcagcagat
E--> 322 ggggtgggag 8880

W--> 323 cagtatctcg agacctagaa aaacatggag caatcacaag tagcaataca gcagctaa-
E--> 324 ca 8940

W--> 325 atgctgcttg tgcctggcta gaagcacaag aggaggaaga ggtgggtttt ccagtca-
E--> 326 cac 9000

W--> 327 ctcaggtacc tttaagacca atgacttaca aggcagctgt agatcttagc cactttt-
E--> 328 taa 9060

W--> 329 aagaaaaggg gggactggaa gggctaattc actcccaaag aagacaagat atcctt-
E--> 330 gatc 9120

E--> 331 tgtggatcta ccacacacaa ggctacttcc ctgattggca gaactacaca
E--> 332 ccagggccag 9180

W--> 333 gggtcagata tccactgacc tttggatggt gctacaagct agtaccagtt gagccaga-
E--> 334 ta 9240

E--> 335 aggtagaaga ggccaataaa ggagagaaca ccagcttgtt acaccctgtg
E--> 336 agcctgcatg 9300

W--> 337 gaatggatga ccctgagaga gaagtgttag agtggaggtt tgacagccgc ctag-
E--> 338 catttc 9360

E--> 339 atcacgtggc ccgagagctg catccggagt acttcaagaa ctgctgacat
E--> 340 cgagcttgct 9420

E--> 341 acaagggact ttccgctggg gactttccag ggaggcgtgg cctgggcggg
E--> 342 actggggagt 9480

E--> 343 ggcgagccct cagatgctgc atataagcag ctgctttttg cctgtactgg
E--> 344 gtctctctgg 9540

E--> 345 ttagaccaga tctgagcctg ggagctctct ggctaactag ggaacccact
E--> 346 gcttaagcct 9600

E--> 347 caataaagct tgccttgagt gcttcaagta gtgtgtgccc gtctgttgtg
E--> 348 tgactctggt 9660

E--> 349 aactagagat ccctcagacc cttttagtca gtgtggaaaa tctctagca
E--> 350 9709

633 <210> SEQ ID NO: 3 

634 <211> LENGTH: 107 

635 <212> TYPE: DNA 

636 <213> ORGANISM: Artificial Sequence 
638 <220> FEATURE:

639 <223> OTHER INFORMATION: Description of the artificial sequence:

640 oligonucleotide for cloning
642 <400> SEQUENCE: 3

W--> 643 aagatgtagt aattagatct gccaatttca cagacaatgc taaaaccata atagta-
E--> 644 cagc 60

E--> 645 tgaacacatc gttagaaatt aattgtacaa gacccaacaa caataca
E--> 646 107 
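The record above follows the rule stated in the error summary: when the <213> response is "Artificial Sequence" or "Unknown", a <220>-<223> section is mandatory. The sketch below shows one way such a check could be expressed; it is an illustration under simplifying assumptions, not the logic of the Checker program. The function name `missing_feature_section` is hypothetical, and a record is assumed to be a list of text lines containing the numeric identifiers.

```python
def missing_feature_section(record_lines):
    """True when <213> is Artificial Sequence or Unknown but <220>/<223> are absent."""
    organism = None
    seen = set()
    for line in record_lines:
        for tag in ("<213>", "<220>", "<223>"):
            if tag in line:
                seen.add(tag)
        if "<213>" in line:
            # Keep only the response text that follows the identifier and its label.
            response = line.partition("<213>")[2].replace("ORGANISM:", "")
            organism = response.strip().lower()
    needs_section = organism in ("artificial sequence", "unknown")
    return needs_section and not {"<220>", "<223>"} <= seen


# Header lines of SEQ ID NO: 3 as printed above.
seq3 = [
    "633 <210> SEQ ID NO: 3",
    "636 <213> ORGANISM: Artificial Sequence",
    "638 <220> FEATURE:",
    "639 <223> OTHER INFORMATION: Description of the artificial sequence:",
]
print(missing_feature_section(seq3))       # complete record
print(missing_feature_section(seq3[:2]))   # same record with the feature lines removed
```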

649 <210> SEQ ID NO: 4 

650 <211> LENGTH: 120 

651 <212> TYPE: DNA 

652 <213> ORGANISM: Artificial Sequence 

654 <220> FEATURE: 

655 <223> OTHER INFORMATION: Description of the artificial sequence: 

656 oligonucleotide for cloning 

658 <220> FEATURE: 

659 <221> NAME/KEY: misc_feature

660 <222> LOCATION: (97)..(99)

661 <223> OTHER INFORMATION: Sequence at this position: (GA)(AT)(GATC), i.e.

662 base at position 97 can be G or A, base at 

663 position 98 can be A or T, and base at 

664 position 99 can be G, A, T or C. 
666 <400> SEQUENCE: 4 

E--> 667 ttttgctcta gaaatgttac aatgtgcttg tcttatgtct cctgttgcag 

668 cttctgttgc 60
E--> 669 atgaaatgc^ ctccctggtc cgatatggat actatgrwnt tttcttgtat 

670 tgttgttggg 120 
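The <223> text for SEQ ID NO: 4 spells out what the ambiguity symbols at positions 97 to 99 stand for. As a small worked check, the sketch below maps the standard IUPAC nucleotide codes r, w and n to the bases they allow and looks them up in the last group of line 669 ("actatgrwnt", taken here to be residues 91 to 100). The table is deliberately partial, and the names `IUPAC` and `bases_at` are chosen for illustration only.

```python
# Partial IUPAC nucleotide code table: only the symbols that occur in this group.
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "r": "ag", "w": "at", "n": "acgt"}


def bases_at(sequence, position):
    """Return the set of bases allowed at a 1-based position of `sequence`."""
    return set(IUPAC[sequence[position - 1]])


group = "actatgrwnt"  # residues 91-100 of SEQ ID NO: 4 as printed in line 669
offset = 90           # residues that precede this group
for pos in (97, 98, 99):
    print(pos, sorted(bases_at(group, pos - offset)))
```

The lookups agree with the applicant's description: position 97 allows G or A, position 98 allows A or T, and position 99 allows any of the four bases.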

673 <210> SEQ ID NO: 5 

674 <211> LENGTH: 17 

675 <212> TYPE: DNA 

676 <213> ORGANISM: Artificial Sequence 

678 <220> FEATURE: 

679 <223> OTHER INFORMATION: Description of the artificial sequence: 

680 sequencing primer
682 <400> SEQUENCE: 5

E--> 683 ccatgtacaa atgtcag 
684 17 

687 <210> SEQ ID NO: 6 

688 <211> LENGTH: 17 

689 <212> TYPE: DNA 

690 <213> ORGANISM: Artificial Sequence 

692 <220> FEATURE: 

693 <223> OTHER INFORMATION: Description of the artificial sequence: 

694 sequencing primer 
696 <400> SEQUENCE: 6
E--> 697 aaaactgtgc gttacaa
698 17

701 <210> SEQ ID NO: 7
702 <211> LENGTH: 17
703 <212> TYPE: DNA
704 <213> ORGANISM: Artificial Sequence
706 <220> FEATURE:
707 <223> OTHER INFORMATION: Description of the artificial sequence:
708 sequencing primer
710 <400> SEQUENCE: 7
E--> 711 gtaaaacgac ggccagt
712 17

715 <210> SEQ ID NO: 8
716 <211> LENGTH: 17
717 <212> TYPE: DNA
718 <213> ORGANISM: Artificial Sequence
720 <220> FEATURE:
721 <223> OTHER INFORMATION: Description of the artificial sequence:
722 sequencing primer
724 <400> SEQUENCE: 8
E--> 725 caggaaacag ctatgac
726 17

729 <210> SEQ ID NO: 9
730 <211> LENGTH: 2148
731 <212> TYPE: DNA
732 <213> ORGANISM: Artificial Sequence
734 <220> FEATURE:
735 <223> OTHER INFORMATION: Description of the artificial sequence: synthetic DNA
737 <220> FEATURE:
738 <221> NAME/KEY: misc_feature
739 <222> LOCATION: (3)..(9)
740 <223> OTHER INFORMATION: BstEII cleavage site
742 <220> FEATURE:
743 <221> NAME/KEY: misc_feature
744 <222> LOCATION: (2143)..(2148)
745 <223> OTHER INFORMATION: BamHI cleavage site
747 <400> SEQUENCE: 9

E--> 748 tgggtcaccg tctattatgg ggtgcctgtg tggaaggaag caaccaccac
749 tctattttgt 60

E--> 750 gcatcagatg ctaaagcata tgatacagag gtacataatg tttgggccac
751 acatgcctgt 120

W--> 752 gtacccacag accccaaccc acaagaagta gtattggtaa atgtgacaga aaattt-
E--> 753 taac 180

W--> 754 atgtggaaaa atgacatggt agaacagatg catgaggata taatcagttt atgggat-
E--> 755 caa 240

W--> 756 agccttaagc catgtgtaaa attaacccca ctctgtgtta gtttaaagtg cact-
E--> 757 gatttg 300

W--> 758 aagaatgata ctaataccaa tagtagtagc gggagaatga taatggagaa aggagaga-
E--> 759 ta 360

W--> 760 aaaaactgca gcttcaatat cagcacaagc ataagagata aggtgcagaa agaa- 
E--> 761 tatgca 420 

W--> 762 ttcttttata aacttgatat agtaccaata gataatacca gctataggtt ga-
E--> 763 taagttgt 480

W--> 764 aacacctcag tgatcacaca ggcctgtcca aaggtatcct ttgagccaat tcccata- 
E--> 765 cat 540 

W--> 766 tattgtgccc cggctggttt tgcgattcta aaatgtaata ataagacgtt caatggaa- 
E--> 767 ca 600 

W--> 768 ggaccatgta caaatgtcag cacagtacaa tgtacacatg gaattcgacc agtagtat- 
E--> 769 ca 660 

E--> 770 actcaactgc tgttaaatgg cagtctagca gaagaagatg tagtaattag 
E--> 771 atctgccaat 720 

W--> 772 ttcacagaca atgctaaaac cataatagta cagctgaaca catctgtaga aat- 
E--> 773 taattgt 780 

W--> 774 acaagaccca acaacaatac aagaaaaagt atccgtatcc agaggggacc agggagag- 
E--> 775 ca 840 

W--> 776 tttgttacaa taggaaaaat aggaaatatg agacaagcac attgtaacat ttctagag- 
E--> 777 ca 900 

W--> 778 aaatggaatg ccactttaaa acagatagct agcaaattaa gagaacaatt tggaaa- 
E--> 779 taat 960 

W--> 780 aaaacaataa tctttaagca gtcatccgga ggggacccag aaattgtaac gca- 
E--> 781 cagtttt 1020 

E--> 782 aattgtggag gggaattttt ctactgtaat tcaacacaac tgtttaatag 
E--> 783 tacttggttt 1080 

W--> 784 aatagtactt ggagtactga agggtcaaat aacactgaag gaagtgacac aatca- 
E--> 785 cactc 1140 

E--> 786 ccatgcagaa taaaacaatt tataaacatg tggcaggaag taggaaaagc 
E--> 787 aatgtatgcc 1200 

W--> 788 cctcccatca gtggccaaat tagatgttca tcaaatatta ctgggctgct at- 
E--> 789 taactcga 1260 

W--> 790 gatggtggta ataacaacaa tgggtccgag attttcagac ctggaggagg cgatat- 
E--> 791 gagg 1320 

W--> 792 gataattgga gaagtgaatt atataaatat aaagtagtaa aaattgaacc attaggag- 
E--> 793 ta 1380 

W--> 794 gcacccacca aggcaaagag acgcgtggtg cagagagaaa agcgcgcagt gggaa- 
E--> 795 tagga 1440 

W--> 796 gctctgttcc ttgggttctt gggagcagca ggaagcacta tgggcgcagc gtcaat- 
E--> 797 gacg 1500 

E--> 798 ctgacggtac aggccagaca attattgtct gatatagtgc agcagcagaa 
E--> 799 caatttgctg 1560 

W--> 800 agggcaattg aggcgcaaca gcatctgttg caactcacag tctggggcat caaa- 
E--> 801 cagctc 1620 

E--> 802 caggcaagaa tcctggctgt ggaaagatac ctaaaggatc aacagctcct 
E--> 803 ggggatttgg 1680 

W--> 804 ggttgctctg gaaaactcat ttgcaccact gctgtgcctt ggaatgctag ttggag- 
E--> 805 taat 1740 

W--> 806 aaatctctgg aacagatttg gaataacatg acctggatgg agtgggacag agaaat-
E--> 807 taac 1800

W--> 808 aattacacaa gcttaataca ctccttaatt gaagaatcgc aaaaccagca agaaaa-
E--> 809 gaat 1860

W--> 810 gaacaagaat tattggaatt agataaatgg gcaagtttgt ggaattggtt taacataa- 
E--> 811 ca 1920 

W--> 812 aattggctgt ggtatataaa attattcata atgatagtag gaggcttggt aggtt- 
E--> 813 taaga 1980 

W--> 814 atagtttttg ctgtactttc tatagtgaat agagttaggc agggatattc accat- 
E--> 815 tatcg 2040 

W--> 816 tttcagaccc acctcccaat cccgagggga cccgacaggc ccgaaggaat agaagaa- 
E--> 817 gaa 2100 

E--> 818 ggtggagaga gagacagaga cagatccatt cgattagtga acggatcc 
E--> 819 2148 

822 <210> SEQ ID NO: 10 

823 <211> LENGTH: 6229 

824 <212> TYPE: DNA 

825 <213> ORGANISM: Artificial Sequence 

827 <220> FEATURE: 

828 <223> OTHER INFORMATION: Description of the artificial sequence: synthetic DNA 

830 <220> FEATURE: 

831 <221> NAME/KEY: sig_peptide 

832 <222> LOCATION: ( 1293 )..( 1295 ) 

833 <223> OTHER INFORMATION: env ATG 

835 <220> FEATURE: 

836 <221> NAME/KEY: misc_feature

837 <222> LOCATION: ( 1377 )..( 1379 )

838 <223> OTHER INFORMATION: env AGT, gp120 Start

840 <220> FEATURE:

841 <221> NAME/KEY: misc_feature

842 <222> LOCATION: ( 1397 )..( 1403 )

843 <223> OTHER INFORMATION: BstEII cleavage site

845 <220> FEATURE:

846 <221> NAME/KEY: misc_feature

847 <222> LOCATION: ( 3537 )..( 3542 )

848 <223> OTHER INFORMATION: BamHI cleavage site

850 <220> FEATURE:

851 <221> NAME/KEY: misc_feature

852 <222> LOCATION: ( 3855 )..( 3857 )

853 <223> OTHER INFORMATION: env TAA, stop 
855 <400> SEQUENCE: 10 

W--> 856 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgt- 
E--> 857 ga 60 

E--> 858 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct 
E--> 859 tcctttctcg 120 

E--> 860 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta 
E--> 861 gggttccgat 180 

E--> 862 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt 
E--> 863 tcacgtagtg 240 

W--> 864 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaa-
E--> 865 ta 300

E--> 866 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat
E--> 867 tcttttgatt 360

W--> 868 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaa-
E--> 869 aaat 420

E--> 870 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat
E--> 871 tcaggctgcg 480

W--> 872 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaa-
E--> 873 agg 540

W--> 874 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cac-
E--> 875 gacgttg 600

W--> 876 taaaacgacg gccagtgagc gtctagttat taatagtaat caattacggg gtcat-
E--> 877 tagtt 660

E--> 878 catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc
E--> 879 gcctggctga 720

W--> 880 ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat ag-
E--> 881 taacgcca 780

E--> 882 atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc
E--> 883 ccacttggca 840

W--> 884 gtacatcaag tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaa-
E--> 885 atgg 900

W--> 886 cccgcctggc attatgccca gtacatgacc ttatgggact ttcctacttg gcagta-
E--> 887 catc 960

E--> 888 tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacat
E--> 889 caatgggcgt 1020

E--> 890 ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt
E--> 891 caatgggagt 1080

E--> 892 ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactc
E--> 893 cgccccattg 1140

E--> 894 acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagc
E--> 895 tcgtttagtg 1200

W--> 896 aaccgtcaga tcgcctggag acgccatcca cgctgttttg acctccatag aaga-
E--> 897 caccgg 1260

W--> 898 gacaattcga gctcggtacc gtcgacgcca ccatgagagt gaaggagaag tatcag-
E--> 899 cact 1320

E--> 900 tgtggagatg ggggtggaaa tggggcacca tgctccttgg gatattgatg
E--> 901 atctgtagtg 1380

W--> 902 ctacagaaaa attgtgggtc accgtctatt atggggtacc tgtgtggaag gaagcaac-
E--> 903 ca 1440

E--> 904 ccactctatt ttgtgcatca gatgctaaag catatgatac agaggtacat
E--> 905 aatgtttggg 1500

W--> 906 ccacacatgc ctgtgtaccc acagacccca acccacaaga agtagtattg gtaaatgt-
E--> 907 ga 1560

W--> 908 cagaaaattt taacatgtgg aaaaatgaca tggtagaaca gatgcatgag gatataat-
E--> 909 ca 1620

W--> 910 gtttatggga tcaaagccta aagccatgtg taaaattaac cccactctgt gttagtt-
E--> 911 taa 1680

W--> 912 agtgcactga tttgaagaat gatactaata ccaatagtag tagcgggaga atga-
E--> 913 taatgg 1740

W--> 914 agaaaggaga gataaaaaac tgctctttca atatcagcac aagcataaga ga-
E--> 915 taaggtgc 1800

W--> 916 agaaagaata tgcattcttt tataaacttg atatagtacc aatagataat accagcta- 
E--> 917 ta 1860 

E--> 918 ggttgataag ttgtaacacc tcagtcatta cacaggcctg tccaaaggta 
E--> 919 tcctttgagc 1920 

W--> 920 caattcccat acattattgt gccccggctg gttttgcgat tctaaaatgt aataa- 
E--> 921 taaga 1980 

W--> 922 cgttcaatgg aacaggacca tgtacaaatg tcagcacagt acaatgtaca catggaat- 
E--> 923 ca 2040 

W--> 924 ggccagtagt atcaactcaa ctgctgttaa atggcagtct agcagaagaa gatgtag- 
E--> 925 taa 2100 

W--> 926 ttagatctgc caatttcaca gacaatgcta aaaccataat agtacagctg aaca- 
E--> 927 catctg 2160 

W--> 928 tagaaattaa ttgtacaaga cccaacaaca atacaagaaa aagtatccgt atcca- 
E--> 929 gaggg 2220 

W--> 930 gaccagggag agcatttgtt acaataggaa aaataggaaa tatgagacaa gca- 
E--> 931 cattgta 2280 

W--> 932 acattagtag agcaaaatgg aatgccactt taaaacagat agctagcaaa ttaaga- 
E--> 933 gaac 2340 

W--> 934 aatttggaaa taataaaaca ataatcttta agcaatcctc aggaggggac ccagaa- 
E--> 935 attg 2400 

W--> 936 taacgcacag ttttaattgt ggaggggaat ttttctactg taattcaaca caactgtt- 
E--> 937 ta 2460 

W--> 938 atagtacttg gtttaatagt acttggagta ctgaagggtc aaataacact gaag-
E--> 939 gaagtg 2520

W--> 940 acacaatcac actcccatgc agaataaaac aatttataaa catgtggcag gaagtag-
E--> 941 gaa 2580

W--> 942 aagcaatgta tgcccctccc atcagtggac aaattagatg ttcatcaaat at- 
E--> 943 tactgggc 2640 

E--> 944 tgctattaac aagagatggt ggtaataaca acaatgggtc cgagatcttc 
E--> 945 agacctggag 2700 

W--> 946 gaggcgatat gagggacaat tggagaagtg aattatataa atataaagta gtaaaa- 
E--> 947 attg 2760 

W--> 948 aaccattagg agtagcaccc accaaggcaa agagaagagt ggtgcagaga gaaaaaa- 
E--> 949 gag 2820 

W--> 950 cagtgggaat aggagctttg ttccttgggt tcttgggagc agcaggaagc ac- 
E--> 951 tatgggct 2880 

W--> 952 gcacgtcaat gacgctgacg gtacaggcca gacaattatt gtctgatata gtgcag-
E--> 953 cagc 2940

E--> 954 agaacaattt gctgagggct attgaggcgc aacagcatct gttgcaactc
E--> 955 acagtctggg 3000

W--> 956 gcatcaaaca gctccaggca agaatcctgg ctgtggaaag atacctaaag gatcaa-
E--> 957 cagc 3060

E--> 958 tcctggggat ttggggttgc tctggaaaac tcatttgcac cactgctgtg 
E--> 959 ccttggaatg 3120 

E--> 960 ctagttggag taataaatct ctggaacaga tttggaataa catgacctgg 
E--> 961 atggagtggg 3180 

W--> 962 acagagaaat taacaattac acaagcttaa tacactcctt aattgaagaa tcgcaa-
E--> 963 aacc 3240

E--> 964 agcaagaaaa gaatgaacaa gaattattgg aattagataa atgggcaagt
E--> 965 ttgtggaatt 3300 

W--> 966 ggtttaacat aacaaattgg ctgtggtata taaaattatt cataatgata gtag- 
E--> 967 gaggct 3360 

E--> 968 tggtaggttt aagaatagtt tttgctgtac tttctatagt gaatagagtt 
E--> 969 aggcagggat 3420 

E--> 970 attcaccatt atcgtttcag acccacctcc caatcccgag gggacccgac 
E--> 971 aggcccgaag 3480 

E--> 972 gaatagaaga agaaggtgga gagagagaca gagacagatc cattcgatta 
E--> 973 gtgaacggat 3540 

W--> 974 ccttagcact tatctgggac gatctgcgga gcctgtgcct cttcagctac caccgctt- 
E--> 975 ga 3600 

E--> 976 gagacttact cttgattgta acgaggattg tggaacttct gggacgcagg 
E--> 977 gggtgggaag 3660 

W--> 978 ccctcaaata ttggtggaat ctcctacagt attggagtca ggaactaaag aa- 
E--> 979 tagtgctg 3720 

W--> 980 ttaacttgct caatgccaca gccatagcag tagctgaggg gacagatagg gttata- 
E--> 981 gaag 3780 

E--> 982 tattacaagc agcttataga gctattcgcc acatacctag aagaataaga 
E--> 983 cagggcttgg 3840 

E--> 984 aaaggatttt gctataagat gggtggcaag tggtcaaaaa gtagtgtgat
E--> 985 tggatggcct 3900 

W--> 986 gctgtaaggg aaagaatgag acgagctgag ccagcagcag atggggtggg agcag- 
E--> 987 tatct 3960 

W--> 988 cgagatctag actagaacta gcttcgatcc agacatgata agatacattg at- 
E--> 989 gagtttgg 4020 

E--> 990 acaaaccaca actagaatgc agtgaaaaaa atgctttatt tgtgaaattt 
E--> 991 gtgatgctat 4080 

W--> 992 tgctttattt gtaaccatta taagctgcaa taaacaagtt aacaacaaca attgcatt- 
E--> 993 ca 4140 

W--> 994 ttttatgttt caggttcagg gggaggtgtg ggaggttttt taaagcaagt aa- 
E--> 995 aacctcta 4200 

W--> 996 caaatgtggt atggctgatt atgatcctgc ctcgcgcgtt tcggtgatga cggtgaa- 
E--> 997 aac 4260 

E--> 998 ctctgacaca tgcagctccc ggagacggtc acagcttgtc tgtaagcgga 
E--> 999 tgccgggagc 4320

W--> 1000 agacaagccc gtcagggcgc gtcagcgggt gttggcgggt gtcggggcgc agccat- 
E--> 1001 gacc 4380 

W--> 1002 cagtcacgta gcgatagcgg agtgtatact ggcttaacta tgcggcatca gagca- 
E--> 1003 gattg 4440 

E--> 1004 tactgagagt gcaccatatg tcgggccgcg ttgctggcgt ttttccatag 
E--> 1005 gctccgcccc 4500 

W--> 1006 cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggac- 
E--> 1007 ta 4560 

E--> 1008 taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt 
E--> 1009 tccgaccctg 4620 

W--> 1010 ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctca- 
E--> 1011 tagc 4680 

E--> 1012 tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg
E--> 1013 ctgtgtgcac 4740

E--> 1014 gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct 
E--> 1015 tgagtccaac 4800

W--> 1016 ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat tagca- 
E--> 1017 gagcg 4860 

W--> 1018 aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacacta- 
E--> 1019 ga 4920 

W--> 1020 aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa aa- 
E--> 1021 gagttggt 4980 

W--> 1022 agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaag- 
E--> 1023 cag 5040 

E--> 1024 cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 
E--> 1025 tacggggtct 5100 

W--> 1026 gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaa- 
E--> 1027 aaagg 5160 

W--> 1028 atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtata- 
E--> 1029 tat 5220 

E--> 1030 gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat
E--> 1031 ctcagcgatc 5280 

W--> 1032 tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac tacga- 
E--> 1033 tacgg 5340 

E--> 1034 gagggcttac catctggccc cagtgctgca atgataccgc gagacccacg 
E--> 1035 ctcaccggct 5400 

E--> 1036 ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag 
E--> 1037 tggtcctgca 5460 

W--> 1038 actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt aag-
E--> 1039 tagttcg 5520

E--> 1040 ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt 
E--> 1041 gtcacgctcg 5580 

W--> 1042 tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt tacat-
E--> 1043 gatcc 5640

W--> 1044 cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaag- 
E--> 1045 taag 5700 

W--> 1046 ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct tactgt-
E--> 1047 catg 5760

W--> 1048 ccatccgtaa gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaa- 
E--> 1049 tag 5820 

W--> 1050 tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgcca- 
E--> 1051 cat 5880 

W--> 1052 agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa actct- 
E--> 1053 caagg 5940 

W--> 1054 atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatctt- 
E--> 1055 ca 6000 

W--> 1056 gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aa- 
E--> 1057 atgccgca 6060 

W--> 1058 aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaa- 
E--> 1059 tat 6120 

W--> 1060 tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga atgtatt-
E--> 1061 tag 6180

E--> 1062 aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccac
E--> 1063 6229 
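
The feature table for SEQ ID NO: 10 gives 1-based, inclusive base positions, so each annotation can be checked against the printed rows. The following is a minimal sketch only, not part of the Checker program: the helper name feature_bases and the two row constants are hypothetical, and the rows are printout lines 898-899 and 902-903 with the wrapped remainder rejoined.

```python
def feature_bases(row: str, row_start: int, first: int, last: int) -> str:
    """Return bases first..last (1-based, inclusive) from a row whose
    first base is numbered row_start in the listing."""
    bases = "".join(row.split())  # drop the spaces between 10-base groups
    if first < row_start or last - row_start >= len(bases):
        raise ValueError("feature lies outside this row")
    return bases[first - row_start : last - row_start + 1]


# Bases 1261-1320 (printout lines 898-899, rejoined; hypothetical constant name)
ROW_1261 = "gacaattcga gctcggtacc gtcgacgcca ccatgagagt gaaggagaag tatcagcact"
# Bases 1381-1440 (printout lines 902-903, rejoined; hypothetical constant name)
ROW_1381 = "ctacagaaaa attgtgggtc accgtctatt atggggtacc tgtgtggaag gaagcaacca"

env_atg = feature_bases(ROW_1261, 1261, 1293, 1295)      # sig_peptide, "env ATG"
bsteii_site = feature_bases(ROW_1381, 1381, 1397, 1403)  # "BstEII cleavage site"
print(env_atg, bsteii_site)
```

Under these assumptions the first slice is the start codon named in the <223> entry and the second has the form GGTNACC, which is the BstEII recognition pattern.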

1066 <210> SEQ ID NO: 11 

1067 <211> LENGTH: 860 

1068 <212> TYPE: DNA 

1069 <213> ORGANISM: Human immunodeficiency virus 

1071 <220> FEATURE: 

1072 <221> NAME/KEY: misc_feature 

1073 <222> LOCATION: (1)..(860) 

1074 <223> OTHER INFORMATION: PI-932 original sequence V1-V2-V3-loop
1076 <400> SEQUENCE: 11 

W--> 1077 tgtgtaccca cagaccccaa cccacaaaag gtagtattgg aaaatgtgac agaa- 
E--> 1078 aatttt 60 

E--> 1079 aacatgtgga aaaatgacat ggtagaacag atgcatgagg atataatcaa 
E--> 1080 tttatgggat 120 

W--> 1081 caaagcctaa agccatgtgt aaaactaacc ccactctgtg ttactttaaa ttgcact- 
E--> 1082 gat 180 

W--> 1083 gctgatttaa attgcaataa tactgattta aattgcacta aagctaattt ggggaa- 
E--> 1084 aaat 240 

W--> 1085 actcataaca atactattag tgggaaaata atagagaaag tagaaataaa aa- 
E--> 1086 actgctct 300 

W--> 1087 ttcaaggtca ccacaggcat aagggataag atgcaaaaag aatatgcact tttgaa- 
E--> 1088 taaa 360 

W--> 1089 cttgatatag taccaataga taatgataag aataatacta actttatatt ga- 
E--> 1090 taagttgt 420 

W--> 1091 aacacctcga ccattacaca ggcctgtcca aaggtatcct ttgagccaat tcccata- 
E--> 1092 cat 480 

E--> 1093 ttttgtgccc cggctggttt tgcgattcta aagtgtaatg aaaagagtta 
E--> 1094 cagtggaaaa 540

W--> 1095 ggaccatgta aaaatgtcag cacagtacaa tgtacacatg gaattaggcc agtagtgt- 
E--> 1096 ca 600 

W--> 1097 actcaactgc tgttgaatgg cagtctagca gaaaaagaag tagtaattag atctga- 
E--> 1098 gaat 660 

W--> 1099 ttcacagaca atgctaaaac cataatagta cagctgaagg aatctgtaaa cat- 
E--> 1100 tacttgt 720 

W--> 1101 ataagacccc acaacactgt aacagacagg atacatatag ggccagggag atcattt- 
E--> 1102 cat 780 

W--> 1103 acaacaagaa aaataaaagg agatataaga caagcacatt gtagccttag gagaa- 
E--> 1104 aagat 840 
E--> 1105 tggaataaca ctttacaaga 
E--> 1106 860 

1109 <210> SEQ ID NO: 12 

1110 <211> LENGTH: 870 

1111 <212> TYPE: DNA 

1112 <213> ORGANISM: Artificial Sequence 

1114 <220> FEATURE: 

1115 <223> OTHER INFORMATION: Description of the artificial sequence: PI-932 

1116 gene cassette, comprising the cleavage sites for 

1117 restriction enzymes BspTI, PstI, BclI, EcoRI,

RAW SEQUENCE LISTING DATE: 08/23/2001

PATENT APPLICATION: US/09/913,159 TIME: 14:18:58

Input Set : A:\P057760.app

Output Set: N:\CRF3\08162001\I913159.raw

1118 BglII, PvuII, XbaI, NheI

1120 <400> SEQUENCE: 12

W--> 1121 tgtgtaccca cagaccccaa cccacaaaag gtagtattgg aaaatgtgac agaa-
E--> 1122 aatttt 60

E--> 1123 aacatgtgga aaaatgacat ggtagaacag atgcatgagg atataatcaa
E--> 1124 tttatgggat 120

W--> 1125 caaagcctta agccatgtgt aaaactaacc ccactctgtg ttactttaaa ttgcact-
E--> 1126 gat 180

W--> 1127 gctgatttaa attgcaataa tactgattta aattgcacta aagctaattt ggggaa-
E--> 1128 aaat 240

W--> 1129 actcataact gcagtattag tgggaaaata atagagaaag tagaaataaa aa-
E--> 1130 actgctct 300

W--> 1131 ttcaaggtca ccacaggcat aagggataag atgcaaaaag aatatgcact tttgaa-
E--> 1132 taaa 360

W--> 1133 cttgatatag taccaataga taatgataag aataatacta actttatatt ga-
E--> 1134 taagttgt 420

W--> 1135 aacacctcgg tgatcacaca ggcctgtcca aaggtatcct ttgagccaat tcccata-
E--> 1136 cat 480

E--> 1137 ttttgtgccc cggctggttt tgcgattcta aagtgtaatg aaaagagtta
E--> 1138 cagtggaaaa 540

W--> 1139 ggaccatgta aaaatgtcag cacagtacaa tgtacacatg gaattcggcc agtagtgt-
E--> 1140 ca 600

W--> 1141 actcaactgc tgttgaatgg cagtctagca gaaaaagaag tagtaattag atctga-
E--> 1142 gaat 660

W--> 1143 ttcacagaca atgctaaaac cataatagta cagctgaagg aatctgtaaa cat-
E--> 1144 tacttgt 720

W--> 1145 ataagacccc acaacactgt aacagacagg atacatatag ggccagggag atcattt-
E--> 1146 cat 780

W--> 1147 acaacaagaa aaataaaagg agatataaga caagcacatt gtagcctttc tagaa-
E--> 1148 aagat 840

E--> 1149 tggaataaca ctttacaaga gatagctagc
E--> 1150 870



Use of n and/or Xaa has been detected in the Sequence Listing. Review the Sequence Listing to ensure that a corresponding explanation is presented in the <220> to <223> fields of each sequence using n or Xaa.

L:11 M:270 C: Current Application Number differs, Replaced Application Number

L:12 M:271 C: Current Filing Date differs, Replaced Current Filing Date

L:27 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:28 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:28 M:254 E: No. of Bases conflict, LENGTH: Input:60 Counted:2 SEQ:1

L:29 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:30 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

M:254 Repeated in SeqNo=1

L:31 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:32 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:33 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:34 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:39 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:40 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:41 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:42 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:45 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:46 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:47 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:48 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:51 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:52 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:53 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:54 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:55 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:56 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:59 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:60 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:61 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:62 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:63 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:64 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:65 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:66 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:69 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:70 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:71 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:72 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:75 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:76 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:79 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:80 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:81 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:82 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:83 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:84 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:85 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS:6

L:86 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

VERIFICATION SUMMARY DATE: 08/23/2001

PATENT APPLICATION: US/09/913,159 TIME: 14:18:59 

Input Set : A:\P057760.app 

Output Set: N:\CRF3\08162001\I913159.raw 

L:89 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:90 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:93M:334W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS: 6 

L:94 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:95 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:96 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:97 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:98 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:99 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:100 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:103 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:104 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:107 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:108 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:109 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:110 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:111 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6

L:112 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:113 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:114 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:115 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:116 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:117 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:118 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:123 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:124 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:127 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:128 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:129 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:130 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:131 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:132 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:135 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:136 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:137 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:138 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:139 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:140 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:141 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:142 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:143 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:144 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:145 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:146 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:147 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6 

L:148 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:149 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6

L:150 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1 

L:153 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6

L:154 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:155 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6

L:156 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:161 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6

L:162 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:163 M:334 W: (2) Invalid Amino Acid in Coding Region, NUMBER OF INVALID KEYS : 6

L:164 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1

L:350 M:252 E: No. of Seq. differs, <211> LENGTH: Input:9709 Found:2943 SEQ:1

L:644 M:254 E: No. of Bases conflict, LENGTH: Input:60 Counted:4 SEQ:3

M:254 Repeated in SeqNo=3

L:646 M:252 E: No. of Seq. differs, <211> LENGTH: Input:107 Found:51 SEQ:3

L:667 M:254 E: No. of Bases conflict, LENGTH: Input:0 Counted:50 SEQ:4

L:669 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:4

M:254 Repeated in SeqNo=4

L:683 M:254 E: No. of Bases conflict, LENGTH: Input:0 Counted:17 SEQ:5

L:697 M:254 E: No. of Bases conflict, LENGTH: Input:0 Counted:17 SEQ:6

L:711 M:254 E: No. of Bases conflict, LENGTH: Input:0 Counted:17 SEQ:7

L:725 M:254 E: No. of Bases conflict, LENGTH: Input:0 Counted:17 SEQ:8

L:748 M:254 E: No. of Bases conflict, LENGTH: Input:0 Counted:50 SEQ:9

M:254 Repeated in SeqNo=9

L:819 M:252 E: No. of Seq. differs, <211> LENGTH: Input:2148 Found:587 SEQ:9

L:857 M:254 E: No. of Bases conflict, LENGTH: Input:60 Counted:2 SEQ:10

M:254 Repeated in SeqNo=10

L:1063 M:252 E: No. of Seq. differs, <211> LENGTH: Input:6229 Found:2504 SEQ:10

L:1078 M:254 E: No. of Bases conflict, LENGTH: Input:60 Counted:6 SEQ:11

M:254 Repeated in SeqNo=11

L:1106 M:252 E: No. of Seq. differs, <211> LENGTH: Input:860 Found:197 SEQ:11

L:1122 M:254 E: No. of Bases conflict, LENGTH: Input:60 Counted:6 SEQ:12

M:254 Repeated in SeqNo=12

L:1150 M:252 E: No. of Seq. differs, <211> LENGTH: Input:870 Found:207 SEQ:12
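
The M:254 entries for SEQ IDs 10 to 12 report a small Counted value against Input:60, and that value equals the number of bases on the wrapped remainder line that carries the running total (for example "aatttt 60" on printout line 1078 for SEQ ID 11). The sketch below illustrates that reading under stated assumptions: the Checker's actual counting rule is not given in this printout, and the function names count_bases and join_wrapped are hypothetical.

```python
def count_bases(line: str) -> int:
    """Count nucleotide letters on one line; digits, spaces and '-' are ignored."""
    return sum(ch in "acgtn" for ch in line.lower())


def join_wrapped(lines):
    """Merge each line ending in '-' with the line that follows it."""
    merged, pending = [], ""
    for line in lines:
        text = line.strip()
        if text.endswith("-"):
            pending += text[:-1]  # keep the bases, drop the wrap hyphen
        else:
            merged.append(pending + text)
            pending = ""
    if pending:  # a dangling wrapped line with no continuation
        merged.append(pending)
    return merged


# Printout lines 1077-1078 of SEQ ID NO: 11, without the W-->/E--> markers
rows = [
    "tgtgtaccca cagaccccaa cccacaaaag gtagtattgg aaaatgtgac agaa-",
    "aatttt 60",
]

per_line = [count_bases(r) for r in rows]               # each physical line alone
rejoined = [count_bases(r) for r in join_wrapped(rows)]  # wrap undone first
print(per_line, rejoined)
```

Counted per physical line, the row that carries the total 60 holds only 6 bases, which matches the reported conflict. Once the wrap is undone the row holds 60 bases, so resubmitting the listing with unwrapped 60-base rows would be expected to clear this class of error, though that should be confirmed by running Checker itself.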